(57) Abstract: A system and method of performing genomic 
research via online analysis of genomic data objects. A 
novel technical information model and implementation 
facility combines genomic visualization and analysis tools 
with the ability to link genomic data objects across (and 
within) plural databases. This provides for interactivity that 
is not possible using conventional Internet access processes 
and also provides for data integration that further enhances 
the effectiveness of genomic research. A business process 
controls how the databases are accessed, particularly the 
commercial fee-for-access databases, so that the researcher 
using a system according to the invention need not be 
concerned with keeping track of access limitations. 
INTEGRATED ACCESS TO BIOMEDICAL RESOURCES 



BACKGROUND OF THE INVENTION 
1. Field of the Invention 

The present invention relates generally to the field of data processing. More 
particularly, the present invention relates to the use of the Internet to access a variety of 
biomedical data resources in an integrated fashion, in order to provide for increased 
efficiency in genomic research. 
2. Background Information 

Almost every cell of every living organism contains a complete set of instructions 
for creating that organism and regulating its cellular structures and activities over its 
lifetime. That set of instructions is called a genome. 

A genome is organized into distinct, microscopic units called chromosomes. 
Chromosomes are coiled threads of deoxyribonucleic acid (DNA). Each thread of DNA is 
composed of two long chains of nucleotides bound together in pairs to form a double 
helix. The human genome is made up of three and a half billion of these nucleotide pairs. 

A given DNA strand contains the cells' instructions for producing proteins. These 
instructions are in the form of specific sequences of nucleotide bases, called genes, within 
the DNA strand. Scientists estimate that 80,000 to 100,000 of these basic units of heredity 
exist within the human genome. Proteins perform a wide variety of physiological tasks. 
They facilitate processes such as digestion, breathing, immune responses, the production 
of heat and energy, and the movement of fluids in and out of cells. 

Most members of a species have the same collection of genes. However, each 
individual's unique characteristics stem from slight variations in the sequence of the 
nucleotides that comprise the genes of that individual. These slight genetic variations that 
define unique characteristics of individuals are called polymorphisms. On average, the 
DNA of any two individuals in a species will differ by about 0.1%. 

Another class of variations, called mutations, also occurs. Both polymorphic 
and mutagenic variations may be harmful to an individual by inhibiting the production, or 
altering the normal function, of a protein. Most diseases result from these types of genetic 
variations. 

Genomics is the study of the nucleic sequences within a genome. The goal of 
genomic inquiry is to identify the sequence of nucleotides, understand the function of 
every gene they comprise, and clarify the genetic variations that define individuality and 
create disease. 

Genomics has a broad scope of applications. These range from the most basic of 
research endeavors to the promise of diagnostic usefulness. 

An important factor limiting the development of new drugs is the limited number 
of known target molecules for which new drugs can be developed. Disease target 
molecules are those that can be affected by a drug and cause a subsequent, desired 
biological reaction in the body. Historically, the process of discovering new target 
molecules has been extremely slow and very expensive due to reliance on trial-and-error 
approaches to discovery. Genomic research will reduce the reliance on trial-and-error by 
enabling drug designers to go directly to target molecules of interest. Thus, applying 
genomic research to drug development should produce new and better drugs more quickly, 
and at a reduced cost. 

Another way that genomic research can help the pharmaceutical industry is in the 
emerging field of pharmacogenomics. Pharmacogenomics focuses on identifying genetic 
variation among patients that may affect the efficacy of a drug (for example, how an 
individual's body absorbs and metabolizes a specific drug) in order to develop more 
personalized drug therapies. Nearly all drug companies are developing 
pharmacogenomics units as a reaction to increasing evidence that drugs do not 
have the same effect on all people. 

In particular, pharmacogenomics is believed to offer at least three different useful 
applications: 

• Increasing the success rate of clinical trials by improving the process of patient 
population selection; 

• Identifying new uses for existing drugs; and 

• Rescuing drugs that have failed previous drug trials by identifying more 
appropriate populations for using the drug. Candidates for drugs to be rescued 
include those that produce adverse reactions in particular sub-populations. 

Molecular toxicology is another area of technology that can benefit from genomics 
research. Approximately 2.2 million Americans are admitted to hospitals every year as a 
result of adverse side effects from drugs. Over 100,000 Americans die annually from 
these adverse (and often unpredictable) effects. For instance, some drugs cause liver damage 
while others are harmful to the kidneys. Organ-specific gene expression profiles for drugs 
already available will enable researchers to study the toxicity of new drug compounds with 
more certainty. 

In addition, gene expression data, combined with polymorphism information 
related to metabolic pathways, will provide important indications of the way an individual 
patient will react to drugs of various dosage levels, thereby significantly reducing the 
unwanted side effects of therapy. 



WO 01/55911 






T/US01/02527 






Risk assessment is a major area of diagnostics that will benefit from genomics. 
Historically, prediction of whether someone is at special risk for a particular disease has 
focused on measuring general indicators in the body, such as blood pressure and 
cholesterol levels. These measurements reflect general physiology but do not explain the 
specific genetic basis of disease in an individual patient. Consequently, these diagnostic 
tests do not discern the underlying cause of disease and can result in compromised medical 
care for patients and increased risk of litigation. 

New genomic-based diagnostics will focus on determining an individual's risk of 
developing a particular disease by looking at specific genes and any disease-related 
changes in that patient. These new diagnostics will likely lead to far better preventive care 
by offering more accurate assessments of a patient's potential risk for developing a 
particular disease. 

Personalized medicine is another major area of diagnostics that will benefit from 
genomics. Genomic information will be available to develop molecular diagnostic tests to 
identify the genetic make-up of individuals. These diagnostic tests will revolutionize 
medicine by enabling physicians to establish therapies designed for each patient, i.e., 
personalized medicine. 

For example, many types of cancer that are distinct at the cellular level 
nevertheless have similar symptoms. Because symptoms may be similar in one genetic 
type of cancer and another, it is important to know everything possible about cancer genes 
and their interactions in prescribing an effective treatment. 

As another example, physicians will be able to use a molecular/genomic test to 
help select the most effective drug with the minimum number of side effects. As a result, 
this approach should benefit the patient with more customized care, reduced length of 
illness, and, ultimately, a better and longer life. 

Besides healthcare, the field of agriculture is also likely to benefit from genomic 
research. The ability to diagnose plant and animal diseases and develop treatments 
targeted against those diseases should produce better agricultural products and improve 
yields. For example, the comparison of genetic information from disease- or pest-resistant 
plant strains with non-resistant strains and the use of selective breeding programs for 
favorable traits will significantly increase the number and success of new strains available 
to various agricultural areas around the world. This has major implications for not only 
increasing the quantity of food but also its nutritional quality. 

Other fields that will likely derive important benefits from genomic information 
include forensics, veterinary medicine, textile production, waste control, and 
environmental remediation. 

A significant impediment to achieving any of the foregoing expected beneficial 
results of genomic research is the sheer volume of genomic information to be 
sifted through and studied. Without exaggeration, the amount of raw nucleic sequencing 
data available to be sifted through and studied is unimaginably vast. Moreover, the body 
of data gets bigger every day (literally), since newly sequenced strands of DNA are 
documented on an ongoing basis. Because of the vastness of the genomic data, it is stored, 
handled, and manipulated via computers. 

Bioinformatics is the use of computers to retrieve, process, and analyze biological 
information. This field of data processing is now considered essential for drug discovery 
and development. Scientists are augmenting traditional "wet" biology with quantitative 
analyses, database comparisons, and computational algorithms. In this way, biology 
research, at least preliminarily, is conducted in a virtual environment before the scientist 
sets foot in the laboratory. Bioinformatic tools and services assist pharmaceutical and 
biotechnology researchers with all phases of drug discovery and development, including 
gene discovery, understanding disease pathways, identifying new disease targets, and the 
discovery and correlation of gene sequence variation to disease. 

Unfortunately, conventional means of access to genomic information do not 
provide comprehensive and easy access so that the data can be analyzed or studied in a 
computationally transparent manner. Genomic information is accumulated in a wide 
variety of databases, some public (freely available), some commercial (available for a 
price), and some proprietary ("in-house" resources that are not shared with outsiders). 

Referring to Fig. 1, genomic databases, both public 102 and private 104, as well as 
search tools 106 are available online. A user 110 uses an interface device 112 to access 
the databases 102, 104 and search tools 106. The interface device 112 communicates with 
a data access portal site 120 via an Internet connection 130. The portal 120 makes 
connections to the databases 102, 104 and search tools 106 via the Internet 140. Another 
operational mode for the user 110 to access the genomic research resources 102, 104, 106 
is serially via an Internet connection without using the portal 120. (This simplified mode 
of connection is not illustrated.) The interface device 112 is typically an implementation 
of a thin client browser application. 

Each database is made up of data objects that contain biological data. The data 
objects differ from one database to the next in terms of the types of biological data they 
contain, and in terms of their formats. Thus, studying data from plural databases requires 
the researcher to learn how to interpret the data as presented in the unique data structures 
and data contents of each particular database. This is a significant inconvenience. 
Thus, what is needed is a bioinformatics visualization tool that will automatically 
interpret data from diverse genomic databases (each containing genomic data objects of 
varying formats) so that it is presented to a user in a predictable, easily-recognizable 
format. 



Additionally, because of the disparate storage formats used by the different 
databases, many of the data objects to be analyzed using automated analysis tools must be 
converted from their native format into a format that is recognizable to the automated 
analysis tool to be used. This is a further inconvenience. 

Thus, what is needed is a bioinformatics software tool that will automatically 
translate data from diverse genomic databases (each containing genomic data objects of 
varying formats) so that it is presented to analysis facilities according to a uniform format. 

Furthermore, when it is discovered that a pair of data objects has a significant 
relationship to one another, there is no conventional mechanism (other than manually 
scribbling a note to oneself) for establishing a linking relationship between them. Such a 
mechanism is most conspicuously absent in the case where the pair of data objects are 
found in two entirely different databases. 

Thus, what is needed is a software facility that enables a user to establish linking 
relationships between data objects, even when those data objects are drawn from diverse 
databases and have diverse formats and types of content. 

Moreover, the existing way for accessing and navigating Hypertext Markup 
Language (HTML) genomic data files provided by Internet database hosts or by a 
centralized Internet portal has crucial limitations. For one thing, there is no interactivity 
possible for manipulation and/or analysis of the data. This interactivity is crucial in terms 
of research effectiveness. 

Thus, what is needed is a software facility that enables a user to interactively 
access and navigate genomic data files over the Internet, whether those files be in the form 
of HTML (as is typically done) or in other formats in which the data may appear. 

SUMMARY OF THE INVENTION 

It is an object of the present invention to provide a bioinformatics visualization tool 
that will automatically interpret data from diverse genomic databases (each containing 
genomic data objects of varying formats) so that it is presented to a user in a predictable, 
easily-recognizable format. 

It is another object of the present invention to provide a bioinformatics software 
process and system that will automatically translate data from diverse genomic databases 
(each containing genomic data objects of varying formats) so that it is presented to 
analysis facilities according to a uniform format. 
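
By way of a non-limiting illustration, the translation of differently formatted data objects into a uniform format might be sketched as follows. The `SequenceRecord` type, the translator functions, and the "KEY: value" input format are hypothetical names introduced only for this example; FASTA is used merely as one familiar input format, and nothing here is taken from the described embodiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceRecord:
    """Hypothetical uniform format handed to analysis tools."""
    identifier: str
    description: str
    sequence: str


def from_fasta(text: str) -> SequenceRecord:
    """Parse a single-record FASTA string into the uniform record."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    identifier, _, description = lines[0][1:].partition(" ")
    return SequenceRecord(identifier, description, "".join(lines[1:]).upper())


def from_key_value(text: str) -> SequenceRecord:
    """Parse a made-up 'KEY: value' flat format (illustrative only)."""
    fields = {}
    for ln in text.strip().splitlines():
        key, _, value = ln.partition(":")
        fields[key.strip().upper()] = value.strip()
    return SequenceRecord(fields["ID"], fields.get("DESC", ""), fields["SEQ"].upper())


# One translator per native format; analysis code only ever sees SequenceRecord.
TRANSLATORS = {"fasta": from_fasta, "keyvalue": from_key_value}


def to_uniform(text: str, source_format: str) -> SequenceRecord:
    """Dispatch to the translator registered for the source format."""
    return TRANSLATORS[source_format](text)
```

Under these assumptions, supporting an additional database format amounts to registering one more translator, while the analysis facilities remain unchanged.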

It is yet another object of the present invention to provide a software process and 
system that enables a user to establish linking relationships between data objects, even 
when those data objects are drawn from diverse databases and have diverse formats and 
types of content. 

It is still another object of the present invention to provide a software process and 
system that enables a user to interactively access and navigate HTML (or other format) 
genomic data files over the Internet. 

It is a further object of the present invention to provide a business process for 
enabling seamless access and interactivity with plural genomic databases, even in the case 
where one or more of those databases is a commercial (i.e., fee for access) database. 

Some of the above objects are made possible by a data processing system that is in 
electronic communication with a local genomic database system and with one or more 
remote genomic database systems. The data processing system includes a graphical user 
interface that enables a user to view genomic data objects graphically. It also includes a 
genomic data object linker and a linkable data object resolver that resolves one or more 
genomic data objects, which are linkable with respect to a subject genomic data object, 
from among data objects found in the local genomic database system and the one or more 
remote genomic database systems. A resolved genomic data object that is resolved by the 
linkable data object resolver is linked to the subject genomic data object by the data object 
linker, so that the resolved genomic data object and the subject genomic data object are 
each provided to the graphical user interface. 

Some of the above objects are also made possible by a system for performing 
genomic research that is in electronic communication with a local genomic database 
system and with one or more remote genomic database systems. The system includes a 
means for presenting a user with a graphical view of genomic data objects, as well as a 
means for linking genomic data objects to one another. The system further includes a 
means for resolving a genomic data object with respect to a subject genomic data object, 
from among genomic data objects found in the local genomic database system and the one 
or more remote genomic database systems. A resolved genomic data object that is 
resolved by the means for resolving is linked to the subject genomic data object by the 
means for linking, so that the resolved genomic data object and the subject genomic data 
object are each provided to the means for presenting. 

Another way that some of the above objects are made possible is by a method of 
performing genomic research with respect to a subject genomic data object, using a local 
genomic database and one or more remote genomic databases as resources. The method 
includes the act of resolving a linkable genomic data object, with respect to the subject 
genomic data object, from among the local genomic database and the one or more remote 
genomic databases, regardless of the data formats of the genomic data objects. 
Additionally, the method includes the act of linking the subject genomic data object with 
the linkable genomic data object to form a set of linked genomic data objects. 
Furthermore, the method includes the act of storing the set of linked genomic data objects 
in the local genomic database. 

Still another way that some of the above objects are made possible is by a 
computer system for use in genomic research that implements a method as described 
above. 

One of the above objects is made possible by a novel technical information model 
and implementation facility. This is accomplished by combining genomic visualization 
and analysis tools with the ability to link genomic data objects across (and within) plural 
databases. 

Another of the above objects is made possible by a method of administering access 
to a plurality of genomic databases, where the genomic databases include a local genomic 
database, a public genomic database, and a commercial genomic database. The method 
includes a step of resolving linkable data objects with respect to a subject data object, and 
a further step of linking the resolved data objects to the subject genomic data object. The 
linkable data objects that are resolved from public genomic databases are resolved 
regardless of the data formats of genomic data objects stored therein, and without 
restriction as to access costs. The linkable data objects that are resolved from commercial 
genomic databases are resolved regardless of data formats of genomic data objects stored 
therein, and are resolved subject to applicable, predetermined access agreements for the 
commercial genomic databases. 
Still another way that some of the above objects are made possible is by a 
computer program product for enabling a computer to administer access to a plurality of 
genomic databases. The plurality of genomic databases includes a local genomic 
database, a public genomic database, and a commercial genomic database. The computer 
program product includes software instructions for enabling the computer to perform 
predetermined operations, and a computer readable medium embodying the software 
instructions. The predetermined operations include the acts according to the methods 
discussed above. 

One aspect of the present invention is a local database for storing genomic data, 
such as nucleic acid sequences, amino acid sequences, oligonucleotides, results of Basic 
Local Alignment Search Tool (BLAST) searches, and entries from medical databases such as 
MEDLINE. 

Another aspect of the present invention is a visualization and analysis facility. 
Visualization and analysis are provided, preferably via dialog boxes, so that parameters may 
be set for BLAST searches, so that text-based searching may be made of a sequence 
database, and so that a bibliographic search may be done of a database. A linker is included 
for linking together nucleic acid sequences, amino acid sequences, BLAST search results, 
MEDLINE entries, etc. All types of genomic/biomedical information are visualized via a 
graphical interface that incorporates viewer and editor components. 

Still another aspect of the present invention is an Internet connector having a 
programming module that resolves links between database objects. This connector looks 
either in the local database or in a remote Internet server to obtain a data object being 
sought. 
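
As a hedged sketch only, the connector's lookup of a sought data object in either the local database or a remote server might look like the following. The `Connector` class and its method names are hypothetical, a dictionary stands in for the local database, and a plain callable stands in for the remote Internet server. Checking the local database first and caching remote results locally are assumptions of this example, not requirements stated above.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-ins: a dict plays the local database, and a callable
# plays the remote Internet server (no real network access is performed).
RemoteFetch = Callable[[str], Optional[dict]]


class Connector:
    def __init__(self, local_db: Dict[str, dict], fetch_remote: RemoteFetch):
        self.local_db = local_db
        self.fetch_remote = fetch_remote

    def resolve(self, object_id: str) -> Optional[dict]:
        """Return the data object for object_id, checking local storage first."""
        found = self.local_db.get(object_id)
        if found is not None:
            return found
        found = self.fetch_remote(object_id)
        if found is not None:
            # Assumption: cache the remote object so later lookups stay local.
            self.local_db[object_id] = found
        return found
```

In this sketch a second request for the same remote object is served from the local store, so the remote callable is invoked only once per object.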
BRIEF DESCRIPTION OF THE DRAWING 

Additional objects and advantages of the present invention will be apparent in the 
following detailed description read in conjunction with the accompanying drawing Figures. 

Fig. 1 illustrates a conventional configuration of using a browser to access Internet 
database resources via a portal. 

Fig. 2 illustrates integrated access to Internet database resources according to an 
embodiment of the present invention. 

DETAILED DESCRIPTION OF THE INVENTION 

One way to view the present invention is that the traditional approach of using a 
thin client browser and a thin portal (refer to Fig. 1) to provide a researcher with access to 
genomic research resources is abandoned. In its place, applicants have discovered the 
increased effectiveness of a system that provides an interactive interface with the 
resources. This interactivity is crucial for increasing the effectiveness of genomic 
research. Another way to view the present invention is as a technical information model 
and implementation facility. The interactivity aspect of the invention is provided by the 
combination of linking of data objects and its visualization and analysis aspects. 

In addition to a local database for storing intermediate and finalized results, the 
present invention has a visualization and analysis aspect. Visualization is provided in an 
advanced form that shows the sequences and other molecules in graphical presentations 
that are intuitively appealing to human perception. In addition to interactivity, the 
ability for the user to easily integrate (i.e., link) data and store the integrated data in the 
local database is very helpful. 

An additional functionality that is provided in a preferred embodiment of the 
present invention is a set of full-scale analysis tools and algorithms directly available at the user 
interface rather than remotely over the Internet. Full-scale analysis allows DNA to be 
evaluated in view of protein, enzyme, and oligo data sets, BLAST results, MEDLINE 
Entrez data, and amino acids. 

An example of a software product that can provide the visualization and analysis 
aspects of the present invention is the VectorNTI™ product of InforMax, Inc. of North 
Bethesda, Maryland. 

Another aspect according to a preferred embodiment of the present invention is the 
use of a resolving system that reaches out to plural databases over the Internet (e.g., NCBI, 
Entrez, PubMed, SRS) to provide integrated database searches. 

Referring to Fig. 2, databases, both public 202 and private 204, and various 
research tools 206 are available online. Also available as a research resource are the 
previous research results and other proprietary data that user 210 stores in a local database 
system 208. As in the prior art, the Internet 220 is used as a communication medium to 
access the various remote resources 202, 204, 206. However, in contrast with the prior art, 
an entirely different set of tools is used for conducting research. 

The user 210 utilizes a user interface 230 that includes a visualization system 232 
and a data set linking facility 234 (hereinafter "linker" for short). The linker 234 provides 
for integration of data sets that the user deems to be worthy of being associated with one 
another for further study in relation to one another. Data sets so linked may be more 
closely examined or analyzed using analysis tools and algorithms 240. Examples of useful 
analysis tools and algorithms to include for use with the present invention are BioPlot™, 
AlignX™, and ContigExpress™, which are all products of InforMax, Inc. of North 
Bethesda, MD. A number of other available analysis tools, such as BLAST, may also be 
usefully employed. 
This examination and analysis produces results that may themselves be linked to 
the data sets from which they were derived. The local database 208 is used to store the 
integrated data sets and results for later study by the user 210, or his or her colleagues. 
The later study may be in the form of additional computer analysis, or in a biology 
laboratory if the results are deemed to be sufficiently promising. 
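
For illustration only, the linking of data sets and analysis results, and their storage in the local database 208, could be modeled along the following lines. The `DataObject` and `LocalDatabase` names, the in-memory storage, and the choice of undirected links are assumptions of this sketch rather than features recited above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class DataObject:
    object_id: str
    kind: str      # e.g. "nucleic_acid", "blast_result", "medline_entry"
    source: str    # name of the database or tool the object came from


@dataclass
class LocalDatabase:
    """In-memory stand-in for the local database 208 (hypothetical API)."""
    objects: Dict[str, DataObject] = field(default_factory=dict)
    links: Set[Tuple[str, ...]] = field(default_factory=set)

    def link(self, a: DataObject, b: DataObject) -> None:
        """Store both objects and record an undirected link between them."""
        self.objects[a.object_id] = a
        self.objects[b.object_id] = b
        self.links.add(tuple(sorted((a.object_id, b.object_id))))

    def linked_to(self, object_id: str) -> List[DataObject]:
        """Return every stored object linked to object_id, sorted by id."""
        ids = {x if y == object_id else y
               for x, y in self.links if object_id in (x, y)}
        return [self.objects[i] for i in sorted(ids)]
```

Because a link records only object identifiers, objects of differing types and from differing databases (for instance, a sequence and a BLAST result derived from it) can be linked in the same way.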

Candidates for linking are identified by the linkable data object resolving system 
250. The linkable data object resolving system 250 connects via the Internet 220 to access 
any of various search tools 206 and databases 202, 204 to search for data objects that are 
relevant to a subject data object that the user 210 has identified as being of interest. The 
resolving system 250 does not establish links. Rather, the resolving system 250 identifies 
data objects from the enormously vast collections of data that are available for inspection 
over the Internet that should have a reasonable probability of being relevant to the subject 
data object. 
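
The description does not specify how relevance is judged. Purely as an assumed stand-in, the sketch below scores candidates by the Jaccard similarity of their k-mer sets and returns those above a threshold. The function names, the value of k, and the threshold are all hypothetical; a practical system might instead rely on a tool such as BLAST. Consistent with the paragraph above, the function only identifies candidates and creates no links.

```python
from typing import Dict, List, Set, Tuple


def kmers(sequence: str, k: int = 3) -> Set[str]:
    """Return the set of overlapping length-k substrings of a sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def resolve_candidates(subject: str, pool: Dict[str, str],
                       threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Rank pool entries whose k-mer similarity to the subject meets threshold.

    Only candidates are returned; no link is created here.
    """
    subject_kmers = kmers(subject)
    scored = [(oid, jaccard(subject_kmers, kmers(seq)))
              for oid, seq in pool.items()]
    hits = [(oid, score) for oid, score in scored if score >= threshold]
    # Highest score first; ties broken by identifier for a stable order.
    return sorted(hits, key=lambda item: (-item[1], item[0]))
```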

To limit and guide the searching by the resolving system 250, the permissioning 
and accounting module 260 directs the resolving system 250 to access only databases that 
are public 202 or those commercial databases 204 for which access agreements have been 
established. The accounting aspect of the permissioning and accounting module 260 
keeps records of access times, durations, and authorizations regarding the use of the 
commercial databases 204. 
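
A minimal sketch of such a permissioning and accounting module, under stated assumptions, is given below. The class and field names and the in-memory log are hypothetical, and the agreement identifiers are placeholders; the sketch shows only the two behaviors described above: restricting searches to public databases and commercial databases covered by an agreement, and recording access times, durations, and authorizations for commercial use.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class AccessRecord:
    database: str
    started_at: float    # access time, e.g. seconds since the epoch
    duration_s: float    # how long the access lasted
    authorization: str   # identifier of the governing access agreement


@dataclass
class PermissioningAndAccounting:
    """Hypothetical stand-in for the permissioning and accounting module 260."""
    public: Set[str]
    agreements: Dict[str, str]   # commercial database -> agreement identifier
    log: List[AccessRecord] = field(default_factory=list)

    def allowed(self, databases: List[str]) -> List[str]:
        """Keep public databases and commercial ones covered by an agreement."""
        return [d for d in databases
                if d in self.public or d in self.agreements]

    def record(self, database: str, started_at: float, duration_s: float) -> None:
        """Log access time, duration, and authorization for commercial use only."""
        if database in self.agreements:
            self.log.append(AccessRecord(database, started_at, duration_s,
                                         self.agreements[database]))
```

In this arrangement the resolving system would consult `allowed` before searching, so the researcher never needs to track which commercial databases are covered.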

In addition to the apparatus aspects of the present invention, methods also form 
some aspects of the invention. A method of performing genomic research according to the 
present invention is accomplished, first, by resolving a linkable genomic data object, and 
then, by linking the linkable genomic data object with a subject genomic data object to 
form a set of linked genomic data objects. The resolving act is performed with respect to 
the subject genomic data object, using one or more remote genomic databases as 
resources. Optionally, a local genomic database is also used in resolving a linkable 
genomic data object. The act of resolving is performed regardless of the data formats of 
the genomic data objects as they may be found in the various databases. After the 
genomic data objects are linked, they are preferably stored as a set of linked genomic data 
objects in the local genomic database. 

Another aspect of the present invention is that it represents a business process 
wherein predetermined access agreements for commercial databases are used to guide 
research steps so that accessing of these databases is entirely seamless from the point of 
view of the researcher/user employing the process. This results in a process of 
administering access to a plurality of genomic databases, public, commercial, as well as 
local (possibly proprietary). The process includes a step of resolving linkable data objects 
with respect to a subject data object, and a further step of linking the resolved data objects 
to the subject genomic data object. The linkable data objects that are resolved from public 
genomic databases are resolved regardless of the data formats of genomic data objects 
stored therein, and without restriction as to access costs. The linkable data objects that are 
resolved from commercial genomic databases are resolved regardless of data formats of 
genomic data objects stored therein, and are resolved subject to applicable, predetermined 
access agreements for the commercial genomic databases. 

The present invention has been described in terms of preferred embodiments; 
however, it will be appreciated that various modifications and improvements may be made 
to the described embodiments without departing from the scope of the invention. The 
scope of the invention is limited only by the appended claims. 
WHAT IS CLAIMED IS: 

1. A data processing system that is in communication with a local database system 
and that is in communication with one or more remote database systems, the data 
processing system comprising: 

a graphical user interface enabling a user to view data objects graphically; 
a data object linker; 

a linkable data object resolver that resolves one or more data objects, which are 
linkable with respect to a subject data object, from among data objects found in the local 
database system and the one or more remote database systems; 

wherein a resolved data object resolved by the linkable data object resolver is 
linked to the subject data object by the data object linker, so that the resolved data object 
and the subject data object are each provided to the graphical user interface. 

2. A data processing system that is in electronic communication with a local 
genomic database system and that is in electronic communication with one or more remote 
genomic database systems, the data processing system comprising: 

a graphical user interface enabling a user to view genomic data objects graphically; 
a genomic data object linker; 

a linkable data object resolver that resolves one or more genomic data objects, 
which are linkable with respect to a subject genomic data object, from among data objects 
found in the local genomic database system and the one or more remote genomic database 
systems; 

wherein a resolved genomic data object resolved by the linkable data object 
resolver is linked to the subject genomic data object by the data object linker, so that the 
resolved genomic data object and the subject genomic data object are each provided to the 
graphical user interface. 



3. The data processing system of claim 2, wherein the data processing system 
communicates with the one or more remote genomic database systems via a network. 



4. The data processing system of claim 2, wherein the data processing system 
communicates with the one or more remote genomic database systems via a network of 
networks. 



5. The data processing system of claim 2, wherein the data processing system 
communicates with the one or more remote genomic database systems via the Internet. 

6. The data processing system of claim 2, wherein the resolved genomic data
object and the subject genomic data object are each of a data type selected from the group
consisting of: nucleic acid sequences, amino acid sequences, oligonucleotides, results of a
BLAST search, and medical data.



7. The data processing system of claim 6, wherein the resolved genomic data 
object and the subject genomic data object are of different data types. 

8. The data processing system of claim 2, wherein the local genomic database 
system and the one or more remote genomic database systems each contain data objects of 
types that are selected from the group consisting of: nucleic acid sequences, amino acid 
sequences, oligonucleotides, results of BLAST searches, and medical data.












9. The data processing system of claim 2, wherein the graphical user interface has 
the capability to graphically depict nucleic acid sequences, amino acid sequences, 
oligonucleotides, results of BLAST searches, and medical data.

10. The data processing system of claim 2, wherein the genomic data object linker
links genomic data objects that are of differing data types. 

11. A system for performing genomic research, the system being in electronic
communication with a local genomic database system and that is in electronic 
communication with one or more remote genomic database systems, the system 
comprising: 

means for presenting a user with a graphical view of genomic data objects; 

means for linking genomic data objects to one another; and 

means for resolving a genomic data object with respect to a subject genomic data 
object, from among genomic data objects found in the local genomic database system and 
the one or more remote genomic database systems; 

wherein a resolved genomic data object resolved by the means for resolving is 
linked to the subject genomic data object by the means for linking, so that the resolved 
genomic data object and the subject genomic data object are each provided to the means 
for printing. 

12. The system for performing genomic research of claim 11, wherein the system
communicates with the one or more remote genomic database systems via the Internet. 

13. The system for performing genomic research of claim 11, wherein the resolved
genomic data object and the subject resolved genomic data object are each of a data type
selected from the group consisting of: nucleic acid sequences, amino acid sequences,
oligonucleotides, results of a BLAST search, and medical data.



14. The system for performing genomic research of claim 13, wherein the resolved
genomic data object and the subject resolved genomic data object are of different data
types.



15. The system for performing genomic research of claim 11, wherein the local
genomic database system and the one or more remote genomic database systems each
contain data objects of types that are selected from the group consisting of: nucleic acid
sequences, amino acid sequences, oligonucleotides, results of BLAST searches, and
medical data.



16. The system for performing genomic research of claim 11, wherein the means 
for presenting has the capability to graphically depict nucleic acid sequences, amino acid 
sequences, oligonucleotides, results of BLAST searches, and medical data.

17. The system for performing genomic research of claim 11, wherein the means
for linking links genomic data objects that are of differing data types. 



18. A computer system adapted to genomic research with respect to a subject
genomic data object, using a local genomic database and remote genomic databases as
resources, the computer system comprising:

a processor, and

a memory, in electronic communication with the processor, including software
instructions adapted to enable the computer system to perform the steps of:

resolve a linkable genomic data object, with respect to the subject genomic data
object, from among the local genomic database and the remote genomic 
databases, regardless of the data formats of the genomic data objects; 

link the subject genomic data object with the linkable genomic data object to 
form a set of linked genomic data objects; and 

store the set of linked genomic data objects in the local genomic database. 



19. A method of performing genomic research with respect to a subject genomic
data object, using a local genomic database and one or more remote genomic databases as 
resources, the method comprising: 

resolving a linkable genomic data object, with respect to the subject genomic data 
object, from among the local genomic database and the one or more remote genomic 
databases, regardless of the data formats of the genomic data objects; 

linking the subject genomic data object with the linkable genomic data object to 
form a set of linked genomic data objects; and 

storing the set of linked genomic data objects in the local genomic database. 
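
The three steps recited above (resolving, linking, storing) can be pictured with a minimal sketch. It is illustrative only and is not the claimed implementation: the `DataObject` and `Database` classes, the `gene` field, and the rule that two objects are linkable when they share a gene annotation are all assumptions introduced here, since the claim does not define a linkability criterion or a storage format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataObject:
    """A genomic data object; every field name here is hypothetical."""
    object_id: str
    data_type: str  # e.g. "nucleic_acid", "amino_acid", "medical_data"
    gene: str       # illustrative key used to decide linkability


@dataclass
class Database:
    """In-memory stand-in for a local or remote genomic database."""
    name: str
    objects: list = field(default_factory=list)
    linked_sets: list = field(default_factory=list)


def resolve(subject, databases):
    """Return objects from any database that are linkable to the subject.

    Assumption: "linkable" means annotated with the same gene.
    The data type of each object is deliberately ignored.
    """
    return [
        obj
        for db in databases
        for obj in db.objects
        if obj.gene == subject.gene and obj.object_id != subject.object_id
    ]


def link(subject, resolved):
    """Form one set containing the subject and its resolved objects."""
    return frozenset([subject, *resolved])


def store(linked, local_db):
    """Persist the linked set in the local database."""
    local_db.linked_sets.append(linked)


local = Database("local", [DataObject("L1", "medical_data", "BRCA1")])
remote = Database("remote", [
    DataObject("R1", "amino_acid", "BRCA1"),
    DataObject("R2", "nucleic_acid", "TP53"),
])
subject = DataObject("S1", "nucleic_acid", "BRCA1")

linked = link(subject, resolve(subject, [local, remote]))
store(linked, local)
```

In this sketch the linked set mixes a nucleic acid sequence, an amino acid sequence, and medical data, which mirrors the dependent claims in which linked objects are of differing data types.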

20. The method of performing genomic research of claim 19, wherein the resolved 
genomic data object and the subject genomic data object are each of a data type selected 
from the group consisting of: nucleic acid sequences, amino acid sequences, 
oligonucleotides, results of a BLAST search, and medical data.

21. The method of performing genomic research of claim 20, wherein the resolved
genomic data object and the subject genomic data object are of different data types.

22. The method of performing genomic research of claim 19, wherein the local
genomic database system and the one or more remote genomic database systems each
contain data objects of types that are selected from the group consisting of: nucleic acid
sequences, amino acid sequences, oligonucleotides, results of BLAST searches, and
medical data.

23. The method of performing genomic research of claim 19, the method further
comprising:

providing a graphical user interface to graphically depict the subject genomic data
object and the resolved genomic data object, regardless of whether they are nucleic acid
sequences, amino acid sequences, oligonucleotides, results of BLAST searches, or medical
data.



24. The method of performing genomic research of claim 19, wherein the act of
linking links genomic data objects that are of differing data types.

25. A method of administering access to a plurality of genomic databases, the
plurality of genomic databases including a local genomic database, a public genomic
database, and a commercial genomic database, the method comprising:

resolving one or more linkable data objects with respect to a subject data object;
and

linking the one or more resolved data objects to said subject genomic data object;

wherein the one or more linkable data objects are resolved from public genomic
databases, regardless of data formats of genomic data objects stored therein, and without
restriction as to access costs; and

wherein the one or more linkable data objects are resolved from commercial
genomic databases, regardless of data formats of genomic data objects stored therein, and
subject to predetermined access agreements for the commercial genomic databases.
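
The access rule recited above (public databases consulted without restriction, commercial databases consulted only under a predetermined access agreement) can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the claimed method: the `AccessAgreement` class, its `queries_remaining` quota, and the tuple layout used for each database are hypothetical, since the claim does not say what terms an access agreement contains.

```python
from dataclasses import dataclass


@dataclass
class AccessAgreement:
    """Hypothetical access terms for one commercial database."""
    queries_remaining: int


def may_query(db_kind, agreement=None):
    """Decide whether a database may be consulted during resolution.

    Local and public databases are always allowed. A commercial
    database is allowed only if an agreement exists and (by assumption)
    still has query quota left.
    """
    if db_kind in ("local", "public"):
        return True
    if db_kind == "commercial":
        return agreement is not None and agreement.queries_remaining > 0
    raise ValueError(f"unknown database kind: {db_kind}")


def resolve_with_policy(subject_key, databases, agreements):
    """Collect linkable object ids from every database the policy permits.

    databases:  list of (name, kind, records), where records maps a
                subject key to a list of object ids (illustrative layout).
    agreements: dict mapping a commercial database name to its agreement.
    """
    resolved = []
    for name, kind, records in databases:
        agreement = agreements.get(name)
        if not may_query(kind, agreement):
            continue  # not covered by an agreement, so skip it
        if kind == "commercial":
            agreement.queries_remaining -= 1  # charge the query to the quota
        resolved.extend(records.get(subject_key, []))
    return resolved


databases = [
    ("pub", "public", {"BRCA1": ["P1"]}),
    ("comA", "commercial", {"BRCA1": ["C1"]}),
    ("comB", "commercial", {"BRCA1": ["C2"]}),
]
agreements = {"comA": AccessAgreement(queries_remaining=1)}

first = resolve_with_policy("BRCA1", databases, agreements)
second = resolve_with_policy("BRCA1", databases, agreements)
```

Keeping the policy check inside the resolution step means the caller never has to track access limitations, which is the benefit the abstract attributes to the business process.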

26. The method of administering access to a plurality of genomic databases recited
in claim 25, wherein the resolved genomic data object and the subject genomic data object
are each of a data type selected from the group consisting of: nucleic acid sequences,
amino acid sequences, oligonucleotides, results of a BLAST search, and medical data.

27. The method of administering access to a plurality of genomic databases recited
in claim 26, wherein the resolved genomic data object and the subject genomic data object
are of different data types.

28. The method of administering access to a plurality of genomic databases recited
in claim 25, wherein the local genomic database system and the one or more remote
genomic database systems each contain data objects of types that are selected from the
group consisting of: nucleic acid sequences, amino acid sequences, oligonucleotides,
results of BLAST searches, and medical data.

29. The method of administering access to a plurality of genomic databases recited
in claim 25, the method further comprising:

providing a graphical user interface to graphically depict the subject genomic data
object and the resolved genomic data object, regardless of whether they are nucleic acid
sequences, amino acid sequences, oligonucleotides, results of BLAST searches, or medical
data.

30. The method of administering access to a plurality of genomic databases recited
in claim 25, wherein the act of linking links genomic data objects that are of differing data
types.



31. A computer program product for enabling a computer to perform genomic
research with respect to a subject genomic data object, using a local genomic database and
remote genomic databases as resources, the computer program product comprising:

software instructions for enabling the computer to perform predetermined
operations, and

a computer readable medium embodying the software instructions;

the predetermined operations including the acts of:

resolve a linkable genomic data object, with respect to the subject genomic data
object, from among the local genomic database and the remote genomic
databases, regardless of the data formats of the genomic data objects;

link the subject genomic data object with the linkable genomic data object to
form a set of linked genomic data objects; and

store the set of linked genomic data objects in the local genomic database.



32. A computer program product for enabling a computer to administer access to a
plurality of genomic databases, the plurality of genomic databases including a local
genomic database, a public genomic database, and a commercial genomic database, the
computer program product comprising:

software instructions for enabling the computer to perform predetermined
operations, and

a computer readable medium embodying the software instructions;

the predetermined operations including the acts of:

resolve one or more linkable data objects with respect to a subject data object;
and

linking the one or more resolved data objects to said subject genomic data
object;

wherein the one or more linkable data objects are resolved from public
genomic databases, regardless of data formats of genomic data objects
stored therein, and without restriction as to access costs; and

wherein the one or more linkable data objects are resolved from commercial
genomic databases, regardless of data formats of genomic data objects
stored therein, and subject to predetermined access agreements for the
commercial genomic databases.
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